Rn/na-notebook #578

rnayebi21 · 2024-12-07T00:54:45Z

Checklist

Please:

- Make sure this PR is against "dev", not "main" (unless this is a release PR).
- Request a review from one of the current main reviewers: brookslogan, nmdefries.
- Make sure to bump the version number in DESCRIPTION. Always increment the patch version number (the third number), unless you are making a release PR from dev to main, in which case increment the minor version number (the second number).
- Describe changes made in NEWS.md, making sure breaking changes (backwards-incompatible changes to the documented interface) are noted. Collect the changes under the next release number (e.g. if you are on 1.7.2, then write your changes under the 1.8 heading).
- See DEVELOPMENT.md for more information on the development process.

Change explanations for reviewer

Added the reviewed na-notebook vignette. Key differences from last time: added an example and an explanation of `complete()`, and added the LOCF-in-version example.

Magic GitHub syntax to mark associated Issue(s) as resolved when this is merged into the default branch

Resolves #{issue number}

removing some lint and adding zoo to DESCRIPTION

brookslogan

Comments so far. Will revisit.

brookslogan · 2024-12-18T00:16:17Z

vignettes/na-notebook.Rmd

+impute_locf <- function(data) {
+  data %>%
+    group_by(geo_value) %>%
+    mutate(across(where(is.numeric), ~ zoo::na.locf(.x, na.rm = FALSE), .names = "{.col}_locf")) %>%


todo: use `fill()` from tidyr to do this instead, so the vignette relies on a smaller set of packages. We should probably also include an `arrange()` so that rows are ordered by `time_value` within each `geo_value`.
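A minimal sketch of what this todo could look like, assuming the data has `geo_value` and `time_value` columns as in the hunk above. Copying the numeric columns first and then filling the copies is just one way to keep the `_locf` suffix; the trailing `ungroup()` and the `toy` data are illustrative assumptions, since the rest of the original pipeline is not shown here.

```r
library(dplyr)
library(tidyr)

impute_locf <- function(data) {
  data %>%
    # Order by time within each geo so that "last observation" is well defined.
    arrange(geo_value, time_value) %>%
    group_by(geo_value) %>%
    # Copy each numeric column to a "<col>_locf" column, leaving originals untouched.
    mutate(across(where(is.numeric), ~.x, .names = "{.col}_locf")) %>%
    # fill() respects the grouping, so values are never carried across geo_values.
    fill(ends_with("_locf"), .direction = "down") %>%
    # Assumption: return ungrouped data; the original pipeline may do otherwise.
    ungroup()
}

# Hypothetical toy data, not from the vignette.
toy <- tibble(
  geo_value = c("ca", "ca", "ca", "ny", "ny"),
  time_value = as.Date("2020-03-01") + c(0, 1, 2, 0, 1),
  value = c(1, NA, 3, NA, 5)
)
locf_result <- impute_locf(toy)
```

As with `zoo::na.locf(.x, na.rm = FALSE)`, a leading `NA` within a group stays `NA` because there is no earlier observation to carry forward.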

brookslogan · 2024-12-18T00:21:58Z

vignettes/na-notebook.Rmd

+
+### NAs from merging
+
+First let's start with discussing the most common type of missing values that appeared in the context of my auxiliary signal project. When working with multiple signals each signal will likely begin recording at different times. In other words each signal's first data point $t_0$ will differ on the absolute time scale. As a result, when calling `epix_merge()` to combine multiple signals, the signals that started recording at a later point in time will have missing values for the time periods where the other signals were already recording. Here's a quick example


todo: let's make this more instructional. Here, that just means tweaking the language: drop "that appeared in the context of my auxiliary signal project", and soften or reword "the most common", since it may not be the most common in general.

(But in other parts we actually need to make the content itself more instructional/relevant: use real data, not buggy/artificial data.)

issue: epix_merge() can introduce NAs both from differing min time_values and differing min versions, but

- the description here is a little ambiguous about which it's referring to
- the table is showing something more like an epi_df

Possible fix: instead of mentioning epix_merge() at this point, we could say this is an issue with some types of joins, and then transition into the epix_merge() example with some more discussion.
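To illustrate the "issue with some types of joins" framing, here is a small sketch using a plain `dplyr::full_join()` on two made-up signals with different first `time_value`s. The signal names and values are hypothetical, and this covers only the differing-min-time_value case, not the differing-min-version case that `epix_merge()` can also hit.

```r
library(dplyr)

# Hypothetical signals: `sig_b` starts reporting two days after `sig_a`.
sig_a <- tibble(
  geo_value = "ca",
  time_value = as.Date("2020-03-01") + 0:3,
  a = 1:4
)
sig_b <- tibble(
  geo_value = "ca",
  time_value = as.Date("2020-03-03") + 0:1,
  b = c(10, 20)
)

# A full join keeps every (geo_value, time_value) pair seen in either signal,
# so `b` is NA on the dates before `sig_b` began reporting.
merged <- full_join(sig_a, sig_b, by = c("geo_value", "time_value")) %>%
  arrange(geo_value, time_value)
```

An inner join would avoid the NAs by dropping the early rows instead, which is the trade-off the surrounding text could spell out before moving on to `epix_merge()`.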

brookslogan · 2024-12-18T00:29:55Z

vignettes/na-notebook.Rmd

+
+```{r latest_fn}
+latest <- function(x) {
+  epix_as_of(x, max_version = max(x$versions_end))


Suggested change

-  epix_as_of(x, max_version = max(x$versions_end))
+  epix_as_of(x, x$versions_end)

We renamed max_version, and versions_end is a scalar, so max() is not needed.

brookslogan · 2024-12-18T00:50:58Z

vignettes/na-notebook.Rmd

+  geo_type = "state",
+  time_values = epirange(20200220, today),
+  geo_values = states,
+  issues = epirange(20201130, today)


suggest: move to `time_values = "*"`, `issues = epirange(12340101, today)` (or some other absurdly early start issue), and then maybe do something else to show readers the range of time values and issues in the data.
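One possible way to show the ranges, sketched on a hypothetical stand-in for the fetched data so that it runs without an API call. The `fetched` tibble and its `time_value` and `issue` column names are assumptions about what the query returns, not taken from the vignette.

```r
library(dplyr)

# Hypothetical stand-in for the result of the API query above.
fetched <- tibble(
  geo_value = "ca",
  time_value = as.Date(c("2020-02-20", "2020-02-21", "2020-02-21")),
  issue = as.Date(c("2020-11-30", "2020-11-30", "2020-12-01")),
  value = c(1, 2, 2.5)
)

# Summarize the span of reference dates and issue dates actually present.
data_ranges <- fetched %>%
  summarize(
    min_time_value = min(time_value),
    max_time_value = max(time_value),
    min_issue = min(issue),
    max_issue = max(issue)
  )
```

Printing `data_ranges` right after the fetch would let readers see what the wildcard query actually returned, rather than inferring it from hard-coded start dates.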

rnayebi21 and others added 2 commits December 6, 2024 09:03

Adding a draft of na-notebook vignette, w logan's suggestions

b9d4226

style: styler (GHA)

98620f7

rnayebi21 requested a review from brookslogan December 7, 2024 00:54

rnayebi21 and others added 6 commits December 8, 2024 19:14

Adding na-notebook vignette to pkgdown articles

46106fd

removing dependencies

daf90f9

style: styler (GHA)

7d12544

fixing some linting and adding zoo to suggests: section of DESCRIPTION

64af906

removing some lint and adding zoo to DESCRIPTION

more lint removal

48ee517

adding no lint comments

f67e325

brookslogan reviewed Dec 18, 2024
